Parallelize CI script to improve run time by joostjager · Pull Request #4357 · lightningdevkit/rust-lightning

joostjager · 2026-01-28T12:00:37Z

Split the CI build job into multiple parts that run in parallel. The longest job runs for ~32 mins.

ldk-reviews-bot · 2026-01-28T12:00:39Z

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

TheBlueMatt · 2026-01-28T12:17:58Z

ci/ci-tests.sh

 }

+# Initialize GitHub Actions Job Summary
+if [ -n "$GITHUB_STEP_SUMMARY" ]; then


Given this is just nice printing should we always turn it on rather than only for CI?

We can do that, but in which file would you want to save the output? The file is, during CI, in $GITHUB_STEP_SUMMARY

Oh, I just skimmed it and thought it printed the summary rather than writing it to a file.

Wasn't happy with the way github does the step summary. Now pushed a different approach that just splits up the workflow in more granular steps.

Bleh, no, ci-tests.sh exists so that we're not tied to github and so that people can run it locally. Splitting it into a million bits breaks those existing goals.

You can still run ci-tests.sh

codecov · 2026-01-28T15:16:33Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.90%. Comparing base (28e3335) to head (a026ace).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4357      +/-   ##
==========================================
- Coverage   86.06%   85.90%   -0.17%     
==========================================
  Files         156      156              
  Lines      103623   103958     +335     
  Branches   103623   103958     +335     
==========================================
+ Hits        89183    89303     +120     
- Misses      11924    12137     +213     
- Partials     2516     2518       +2

Flag	Coverage Δ
tests	`85.90% <ø> (-0.17%)`	⬇️


joostjager · 2026-01-29T14:46:00Z

Not happy with this yet, exploring other options.

TheBlueMatt

I'm still not really sold on this - we're gonna end up missing a step that we add in CI or something and is more logic in ci-tests.sh really helpful - how often are you actually looking at individual steps to understand the performance profile of ci-tests.sh?

joostjager · 2026-01-30T20:38:29Z

It's not so much about performance profile anymore. That was what I had initially. The current shape of it is to make it easy to see which test failed and jump to log files. It's friction that I experience with one big log file.

Adding things in CI is not something that happens very often. I think it is worth doing this change. If you add to CI and follow the pattern, it seems unlikely that something is forgotten.

ldk-reviews-bot · 2026-02-02T00:00:47Z

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

tnull

Concept ACK.

I like the idea as it makes it easier to just run a specific subset of tests/checks, while still maintaining backwards compat.

Breaking the CI tests into individual steps might also be a good first step before we move some of them into a nightly job eventually?

tnull · 2026-02-02T08:05:00Z

ci/ci-tests.sh

+workspace-tests
+ldk-upgrade-tests
+workspace-member-checks
+lightning-dnssec


Can we make the crate-related step naming consistent? Currently some are using just an abbreviated prefix (tx-sync, block-sync), while others like this spell out the full crate name. If we intend this to be used by devs regularly, would be good to make the step names predictable, so that we don't have to always look up what the exact step is called.

Yes, good one, done.

Also renamed step IDs to be action-oriented (e.g., test-lightning-block-sync instead of lightning-block-sync-tests), matching how the echo statements and GitHub step names already use verbs like "Testing..." or "Checking...".
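The step selection discussed here can be pictured as a small dispatcher keyed on action-oriented step IDs. This is only a sketch: `run_step` is a hypothetical name, the two step IDs are the ones mentioned in this thread, and the `echo` lines stand in for the real cargo invocations, which are not shown here.

```shell
# Hypothetical dispatcher: maps a step ID to the work for that step.
# The echo lines are placeholders for the actual commands.
run_step() {
    case "$1" in
        test-lightning-block-sync)
            echo "Testing lightning-block-sync..."
            ;;
        check-workspace-members)
            echo "Checking workspace members..."
            ;;
        *)
            # Unknown IDs fail loudly so a typo does not silently skip a step.
            echo "Unknown step: $1" >&2
            return 1
            ;;
    esac
}
```

With predictable names, a developer could guess a step ID such as `test-lightning-block-sync` without looking it up first.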

TheBlueMatt · 2026-02-02T11:50:31Z

The current shape of it is to make it easy to see which test failed

But even with more discrete steps you're still scrolling up until you find the + ... line to see which command specifically failed? I'm not sure how this helps.

and jump to log files. It's friction that I experience with one big log file.

Doesn't this happen automatically (after github finishes loading....)

I'm really not happy seeing a laundry list of steps copied between the bash script and github config - these are inevitably going to get out of sync and we'll end up having steps skipped in CI.

joostjager · 2026-02-02T14:02:20Z

But even with more discrete steps you're still scrolling up until you find the + ... line to see which command specifically failed? I'm not sure how this helps.

I always find it hard to scroll up in the long log output to find the last command executed, and not scroll further up to an earlier command and incorrectly think that that is the failing one. I think the following is an improvement to that:

Doesn't this happen automatically (after github finishes loading....)

It jumps to the very end indeed, but I cannot see directly what it was doing.

I'm really not happy seeing a laundry list of steps copied between the bash script and github config - these are inevitably going to get out of sync and we'll end up having steps skipped in CI.

I would hope the set of tests is relatively stable, which makes getting out of sync unlikely. But I pushed a commit that runs a verification at the end to catch it anyway.

TheBlueMatt · 2026-02-02T16:21:28Z

I always find it hard to scroll up in the long log output to find the last command executed, and not scroll further up to an earlier command and incorrectly think that that is the failing one

Right, I don't disagree, but this doesn't fix that? There are still multiple commands run in many steps and you still have to scroll up and find the one that failed. #4368 should improve things a bit, but if you actually want to fix this problem probably we need something in ci-tests.sh that simply catches errors and prints the command that errored.
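One way to "catch errors and print the command that errored" is a small wrapper around each command. This is a minimal sketch, not what ci-tests.sh does: `run` is a hypothetical helper name. In bash, a `trap ... ERR` handler that prints `$BASH_COMMAND` would be an alternative that needs no per-command wrapper.

```shell
# Hypothetical helper: run a command and, if it fails, print the command
# line and its exit status to stderr before propagating the failure.
run() {
    "$@" || {
        run_status=$?
        echo "FAILED (exit $run_status): $*" >&2
        return "$run_status"
    }
}
```

A script could then call, for example, `run cargo test -p lightning-transaction-sync`, so the last line of a failing log names the command that broke.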

joostjager · 2026-02-02T18:46:31Z

Agreed that #4368 improves it a bit, but I still think that also having these github-native sections is better, even if they sometimes group multiple commands. ci-tests.sh can print the command that failed, but then you still have to scroll up to get to the start of the log.

joostjager · 2026-02-03T08:01:27Z

Worth noting that this also helps when waiting for CI to complete, which can take a while. With the current single-step approach there's no way to tell how far along a run is. You just see it running. With split steps you can see which step is currently executing, giving a sense of progress.

GitHub also shows timing per step, which makes it easier to identify where optimization efforts would have the most impact.

tnull · 2026-02-03T13:10:51Z

I'm really not happy seeing a laundry list of steps copied between the bash script and github config - these are inevitably going to get out of sync and we'll end up having steps skipped in CI.

Hmm, I see the point about things potentially getting out-of-sync at some point. No strong opinion either way, but even if we end up not doing the CI changes, can we at least keep the splitting of ci-tests.sh? That would be pretty handy for local testing, IMO.

joostjager · 2026-02-03T13:24:02Z

The out of sync risk is addressed in the extra commit that I pushed where completion of all steps is verified.

I also think that step selection can be useful in CI other than for reporting purposes. Currently our CI is painfully slow, and there are probably build matrix / step combinations that can for example be moved to a nightly job.

joostjager · 2026-02-04T10:14:02Z

Rebased onto --quiet change.

tnull

Generally still makes sense to me, especially the part that allows running a subset of the CI script locally. Changes look good to me, modulo one nit.

Seems we might need to discuss this offline then to make progress here.

tnull · 2026-02-05T08:51:28Z

ci/ci-tests.sh

+
+# Log the completed step to GITHUB_ENV for the verification step
+if [ -n "$GITHUB_ENV" ]; then
+	echo "CI_COMPLETED_STEPS=${CI_COMPLETED_STEPS:-} $STEP" >> "$GITHUB_ENV"


nit: It's a bit weird to modify the external env variable here.

do you have a suggestion how to do it otherwise? We need to write the modified CI_COMPLETED_STEPS var to the github file in order for it to carry over to the next step, and ultimately to the verification step that checks that we didn't forget anything.
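The completion check described here could look roughly like the following. It is a sketch under assumptions: `ALL_STEPS`, `missing_steps`, and `verify_complete` are hypothetical names, and the step list is illustrative rather than the PR's real list. The completed steps are passed in as the space-separated string that accumulates in `CI_COMPLETED_STEPS`.

```shell
# Illustrative list of every step the script knows about (an assumption).
ALL_STEPS="test-workspace check-workspace-members test-lightning-block-sync"

# Print each known step that does not appear in the completed list ($1).
missing_steps() {
    completed=" $1 "
    for step in $ALL_STEPS; do
        case "$completed" in
            *" $step "*) ;;          # step was recorded as completed
            *) echo "$step" ;;       # step never ran
        esac
    done
}

# Fail if any known step is missing from the completed list ($1).
verify_complete() {
    missing=$(missing_steps "$1")
    if [ -n "$missing" ]; then
        echo "Steps not executed:" $missing >&2
        return 1
    fi
}
```

The final workflow step would then call something like `verify_complete "$CI_COMPLETED_STEPS"` and fail the job if a step listed in the script was never invoked from the workflow.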

joostjager · 2026-02-07T08:17:32Z

The build matrix is now dynamic. A new ci-config job determines both the platform/toolchain matrix and a run_all flag. On regular PRs only core steps run on a single self-hosted/1.75.0 runner. Non-core steps are gated behind labels (block-sync, no-std, c-bindings, cfg-flags).

Adding the full-test label expands the matrix to all six platform/toolchain combinations and enables every test step, plus fuzz, coverage, and benchmark. Pushes to main behave the same as full-test. The workflow re-triggers on label changes so you can add full-test to an existing PR and get the full suite without re-pushing.

For branch protection, require build (windows-latest, stable) since that check only appears when the full matrix is active.
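The matrix selection described in this comment can be sketched as a shell function. Only the matrix choice is shown and the `run_all` flag is left out. `select_matrix` and `CORE_MATRIX` are hypothetical names, the full matrix is copied from the workflow snippet quoted in this thread, and the inputs (event name, ref, labels) are assumed to be passed in by the workflow.

```shell
# Full matrix as quoted from the workflow; the core matrix is the single
# self-hosted/1.75.0 runner mentioned above.
FULL_MATRIX='{"include":[{"platform":"self-hosted","toolchain":"stable"},{"platform":"self-hosted","toolchain":"beta"},{"platform":"self-hosted","toolchain":"1.75.0"},{"platform":"windows-latest","toolchain":"stable"},{"platform":"macos-latest","toolchain":"stable"},{"platform":"macos-latest","toolchain":"1.75.0"}]}'
CORE_MATRIX='{"include":[{"platform":"self-hosted","toolchain":"1.75.0"}]}'

# $1: event name ("push" or "pull_request"), $2: git ref,
# $3: space-separated PR labels.
select_matrix() {
    # The full-test label always expands to the full matrix.
    case " $3 " in
        *" full-test "*) echo "$FULL_MATRIX"; return ;;
    esac
    # Pushes to main behave the same as full-test.
    if [ "$1" = "push" ] && [ "$2" = "refs/heads/main" ]; then
        echo "$FULL_MATRIX"
    else
        echo "$CORE_MATRIX"
    fi
}
```

In a workflow step, the result would typically be exported with something like `echo "matrix=$(select_matrix ...)" >> "$GITHUB_OUTPUT"` so that later jobs can consume it.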

TheBlueMatt · 2026-02-07T16:55:12Z

.github/workflows/build.yml

+      - name: Determine matrix and scope
+        id: config
+        run: |
+          FULL_MATRIX='{"include":[{"platform":"self-hosted","toolchain":"stable"},{"platform":"self-hosted","toolchain":"beta"},{"platform":"self-hosted","toolchain":"1.75.0"},{"platform":"windows-latest","toolchain":"stable"},{"platform":"macos-latest","toolchain":"stable"},{"platform":"macos-latest","toolchain":"1.75.0"}]}'


Sheesh can we just have a separate yml file for nightly jobs?

Both concepts aren't mutually exclusive. I thought just doing limited scope testing for PRs that are in progress is useful too. Of course if we move the whole build matrix to the nightly, this is unnecessary, yes.

TheBlueMatt · 2026-02-07T16:56:00Z

.github/workflows/build.yml

+      - name: Check and build docs for workspace members
+        shell: bash
+        run: CI_ENV=1 CI_MINIMIZE_DISK_USAGE=1 ./ci/ci-tests.sh check-workspace-members
+      # --- Label-gated: block-sync ---


Looks like these are currently not being run on the normal build.

This PR demonstrates using labels to select CI test subsets, with a full run gated on merge.

A normal PR push only runs core steps. The other groups (block-sync, no-std, c-bindings, cfg-flags) are skipped unless you add the corresponding label. What is a "normal run"? Intentionally just the core. The idea is to reduce CI load for everyday pushes. Before merging, you add the full-test label to run the complete suite.

TheBlueMatt · 2026-02-07T17:00:24Z

ci/ci-tests.sh

 WORKSPACE_MEMBERS=( $(cat Cargo.toml | tr '\n' '\r' | sed 's/\r    //g' | tr '\r' '\n' | grep '^members =' | sed 's/members.*=.*\[//' | tr -d '"' | tr ',' ' ') )

+# Verify that all steps were executed (called at the end of CI)
+if [ "$1" = "--verify-complete" ]; then


Meh come on, originally the stated purpose of this PR was to make it easier to find out what failed. It didn't really accomplish that and after the --quiet change (well, and a few more warning fixups that we need to do?) this isn't really an issue anymore (you do need to scroll from the top, but the entire CI run is only a few pages of output so that's fast). Now we have a bunch of complexity to validate steps are all run and for what?

Now we want to see when CI passes...something? Personally I run ci-tests.sh locally and just wait for the first cargo test to complete, which is quite obvious from the output so I'm not really sure what the goal here is.

I'm also pretty dubious the "we want to split CI for local testing" goal is accomplished - the steps here are clearly not conducive to that, if the goal is local subset testing we should probably group tests by useful tags (not just individual steps) so that we can do things like "run all tests testing no-std" or "run all tests on lightning-transaction-sync", even though there's overlap of those. But more generally I'm not quite sure what the exact use-case here is - what do we want to accomplish that cargo test -p lightning-transaction-sync doesn't?

Fair points on the earlier motivations. --quiet addresses most of the failure-finding friction, and local subset testing isn't the main driver.

The remaining motivation is CI load. Every push currently runs the full suite across all matrix combinations. The step splitting enables gating: core tests on every push, heavier groups only when relevant via labels, full suite required pre-merge via full-test. The --verify-complete step catches any drift between the script and the workflow.

If a separate nightly yml is preferred for the heavier jobs, that works too, but then we lose the ability to run them on-demand for a specific PR via labels.

To be clear, this PR has become exploratory at this point. It was a concrete merge proposal when it just split steps, but has evolved into a proof of concept for label-gated CI.

joostjager · 2026-02-12T09:43:19Z

Abandoned the granular steps approach. Since all steps need to run for every PR anyway, the real win is parallelism, not granularity.

This force-push replaces the previous approach with 4 parallel jobs via a reusable workflow. Wall-clock time should go down from ~150 min to ~49 min.

First run timings for the build-* jobs:

Job	Duration
build-bindings	49m 50s
build-workspace	40m 37s
build-sync	28m 05s
build-cfg-flags	26m 27s

joostjager · 2026-02-13T13:39:21Z

Rebalanced jobs a bit. Longest job now ~32 mins

TheBlueMatt · 2026-02-16T14:48:12Z

.github/workflows/build.yml

+      setup-bitcoind: true
+      ci-env: true
+
+  build-complete:


What's the point of this? Won't the individual jobs report errors?

This was just to have a single gating condition. Will remove.

TheBlueMatt · 2026-02-16T14:48:56Z

.github/workflows/ci-build.yml

+      install-arm-target:
+        description: Whether to install ARM embedded target
+        type: boolean
+        default: false
+      setup-bitcoind:
+        description: Whether to set up bitcoind/electrs
+        type: boolean
+        default: false
+      ci-env:
+        description: Whether to set CI_ENV=1
+        type: boolean
+        default: false


Why not just always do these things?

To save time if a sub CI script doesn't need it.

Remove ci-env, unnecessary

It should only take a few seconds to load the bitcoind/electrs files from cache/remote host? Can we just avoid the complexity?

https://github.com/lightningdevkit/rust-lightning/compare/ee1c9330ec855e6cca41bbacf1035138d9ddc2ab..a026acef4f52da63265d1d2bcbfee59af2a0394d

ci/ci-tests-common.sh

Shellcheck is a static analysis tool for shell scripts and belongs with the other linting checks rather than in the build matrix. This also prepares for splitting the build job into parallel sub-jobs, where running shellcheck in each sub-job would be redundant.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

joostjager · 2026-02-18T13:17:39Z

Moved the shellcheck step from the build job to the linting job in a preparatory commit, since it doesn't pertain to any particular sub-script and fits better alongside the other lint checks.

The previous 4-job split had a significant imbalance (22m to 51m). This 6-job split groups steps by logical concern rather than trying to balance by time, and still achieves a better range (20m to 37m) because the two heaviest jobs got split along natural seams.

Job timings from the latest CI run:

Job	Time
build-workspace	20m
build-nostd	22m
build-sync	27m
build-bindings	27m
build-cfg-flags	30m
build-features	37m

Total time across all jobs is 163m vs ~144m for the original monolithic script, so about 13% overhead from duplicated setup/compilation.
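The overhead figure can be reproduced from the job timings in the table above with integer arithmetic. The variable names are only illustrative.

```shell
# Sum the six per-job timings (minutes) from the table above.
total=0
for minutes in 20 22 27 27 30 37; do
    total=$((total + minutes))
done

# Compare against the ~144m quoted for the original monolithic script.
baseline=144
overhead_pct=$(( (total - baseline) * 100 / baseline ))

echo "total=${total}m overhead=${overhead_pct}%"   # prints "total=163m overhead=13%"
```

The extra 19 minutes of compute buy a wall-clock time bounded by the slowest job (37m) instead of the full sequential run.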

Verified correctness by concatenating all sub-scripts (stripping per-file boilerplate) into a single file and diffing against the original script. Since steps moved between groups, the diff initially shows many moved blocks. To confirm nothing was lost or added, I manually reordered the blocks in the concatenated file to match the original order and verified the diff was empty.
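A coarser, automated variant of this check is sketched below. It is not the manual reordering described above: it compares the sorted non-blank lines of the original script with those of the concatenated sub-scripts, so it ignores ordering entirely. `same_lines` is a hypothetical helper, and it assumes per-file boilerplate has already been stripped from the sub-scripts.

```shell
# $1: original script; remaining arguments: the sub-scripts.
# Returns 0 when both sides contain the same multiset of non-blank lines.
same_lines() {
    original=$1
    shift
    tmp=$(mktemp -d) || return 1
    # Drop blank lines and sort so that moved blocks do not show up as diffs.
    grep -v '^[[:space:]]*$' "$original" | sort > "$tmp/original"
    cat "$@" | grep -v '^[[:space:]]*$' | sort > "$tmp/split"
    status=0
    diff "$tmp/original" "$tmp/split" || status=$?
    rm -rf "$tmp"
    return "$status"
}
```

Because ordering is discarded, a passing result only shows that no line was lost or added. It says nothing about whether a command ended up in a sensible group.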

Open question: with 6 jobs each doing a fresh build, the CI_MINIMIZE_DISK_USAGE cargo cleans between steps may no longer be worth the time cost. Also wondering whether we should set up some form of cargo build caching (e.g. actions/cache on the target directory) for the self-hosted runners, since each job now independently compiles the full dependency tree from scratch.

TheBlueMatt

Aside from the comment about just re-fetching bitcoind this LGTM. I don't see a reason to avoid the 10-second cost for more complexity in CI scripts and jobs.

ldk-reviews-bot · 2026-02-18T15:37:15Z

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

Split the monolithic ci-tests.sh into six focused scripts that run as separate parallel jobs via a reusable workflow (ci-build.yml):

- ci-tests-workspace.sh: workspace checks, tests, docs, and downstream compat
- ci-tests-features.sh: crate feature/flag combinations (dnssec, tokio, backtrace, test vectors, serde)
- ci-tests-bindings.sh: c_bindings builds and tests
- ci-tests-nostd.sh: no_std builds and compatibility checks
- ci-tests-cfg-flags.sh: experimental gated cfg flags (taproot, simple_close, lsps1_service, peer_storage)
- ci-tests-sync.sh: block sync and transaction sync clients

Shared setup (MSRV pins, backtrace) is extracted into ci-tests-common.sh. The ci-tests.sh script now delegates to the sub-scripts for local use.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
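The delegation for local use might look something like the sketch below. The script names come from the commit message above. `run_all`, the directory argument, and the order of execution are assumptions, and the real ci-tests.sh may be structured differently.

```shell
# Sub-scripts named in the commit message above.
SUB_SCRIPTS="ci-tests-workspace.sh ci-tests-features.sh ci-tests-bindings.sh ci-tests-nostd.sh ci-tests-cfg-flags.sh ci-tests-sync.sh"

# Run every sub-script found in directory $1, stopping at the first failure.
run_all() {
    dir=$1
    for script in $SUB_SCRIPTS; do
        echo "Running $script"
        "$dir/$script" || {
            echo "$script failed" >&2
            return 1
        }
    done
}
```

Running the sub-scripts one after another keeps the local entry point equivalent to the old monolithic run, while CI is free to start each script as its own parallel job.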

TheBlueMatt

We'll have to monitor total usage to see if it creeps back up and we have to revert this to reduce load, but easy enough for now.

joostjager changed the title ~~ci: Add GitHub Job Summary reporting to ci-tests.sh~~ Add GitHub Job Summary reporting to ci-tests.sh Jan 28, 2026

TheBlueMatt reviewed Jan 28, 2026

joostjager force-pushed the ci-test-summary branch from 13792ed to 03aa9f1 Compare January 28, 2026 12:20

joostjager force-pushed the ci-test-summary branch from 4f74e2d to 6f524c7 Compare January 29, 2026 15:44

joostjager changed the title ~~Add GitHub Job Summary reporting to ci-tests.sh~~ Split CI script into separate GitHub Actions steps Jan 29, 2026

joostjager requested a review from TheBlueMatt January 30, 2026 08:19

joostjager marked this pull request as ready for review January 30, 2026 11:09

joostjager requested a review from tnull January 30, 2026 11:09

TheBlueMatt reviewed Jan 30, 2026

tnull reviewed Feb 2, 2026

joostjager force-pushed the ci-test-summary branch 2 times, most recently from e08a604 to 8a620a5 Compare February 2, 2026 08:35

joostjager mentioned this pull request Feb 3, 2026

Use --quiet not --verbose on cargo calls in ci-tests.sh #4368

Merged

joostjager force-pushed the ci-test-summary branch from c5aa8da to d566539 Compare February 4, 2026 09:24

joostjager requested a review from tnull February 5, 2026 08:17

tnull reviewed Feb 5, 2026

joostjager force-pushed the ci-test-summary branch from 6d79246 to ac73165 Compare February 7, 2026 08:12

joostjager added the full-test label Feb 7, 2026

joostjager removed the full-test label Feb 7, 2026

joostjager force-pushed the ci-test-summary branch from ac73165 to e1385d1 Compare February 7, 2026 08:19

TheBlueMatt reviewed Feb 7, 2026

joostjager added full-test and removed full-test labels Feb 8, 2026

joostjager marked this pull request as draft February 12, 2026 08:42

joostjager force-pushed the ci-test-summary branch from e1385d1 to c011c0b Compare February 12, 2026 09:41

joostjager removed the full-test label Feb 12, 2026

joostjager force-pushed the ci-test-summary branch from c011c0b to c2c0315 Compare February 12, 2026 10:54

joostjager changed the title ~~Split CI script into separate GitHub Actions steps~~ Parallelize CI script to improve run time Feb 12, 2026

joostjager added this to Weekly Goals Feb 12, 2026

joostjager self-assigned this Feb 12, 2026

TheBlueMatt reviewed Feb 16, 2026

joostjager force-pushed the ci-test-summary branch 2 times, most recently from 55cda17 to ee1c933 Compare February 18, 2026 10:36

joostjager marked this pull request as ready for review February 18, 2026 13:21

joostjager requested a review from TheBlueMatt February 18, 2026 13:21

TheBlueMatt reviewed Feb 18, 2026

joostjager force-pushed the ci-test-summary branch from ee1c933 to a026ace Compare February 18, 2026 15:57

TheBlueMatt approved these changes Feb 18, 2026

joostjager requested a review from tnull February 18, 2026 19:16
